DARL: Distance-Aware Uncertainty Estimation for Offline Reinforcement Learning
Abstract
To facilitate offline reinforcement learning, uncertainty estimation is commonly used to detect out-of-distribution data. Through inspection, we show that current explicit uncertainty estimators, such as Monte Carlo Dropout and model ensembles, are not competent to provide trustworthy uncertainty estimation in offline reinforcement learning. Accordingly, we propose a non-parametric distance-aware uncertainty estimator that is sensitive to changes in the input space for offline reinforcement learning. Based on our new estimator, adaptive truncated quantile critics are proposed to underestimate out-of-distribution samples. We show that the proposed distance-aware uncertainty estimator is able to offer better uncertainty estimation compared to previous methods. Experimental results demonstrate that our proposed DARL method is competitive with state-of-the-art methods on offline evaluation tasks.
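The abstract does not say how the distance is computed or how the truncation adapts, so the following is only an illustrative sketch under stated assumptions, not the DARL implementation. It assumes a k-nearest-neighbor Euclidean distance to the offline dataset as the non-parametric uncertainty signal, and a hypothetical rule that drops a larger share of the highest quantiles as that distance grows. The names `knn_uncertainty`, `truncated_value`, `max_drop_frac`, and `scale` are invented for this example.

```python
import numpy as np


def knn_uncertainty(query, dataset, k=5):
    """Mean Euclidean distance from each query point to its k nearest dataset points.

    Assumption: larger distance to the offline data means higher uncertainty.
    """
    query = np.atleast_2d(query)
    # Pairwise distances, shape (n_query, n_dataset).
    dists = np.linalg.norm(query[:, None, :] - dataset[None, :, :], axis=-1)
    # After partitioning, the first k columns hold the k smallest distances.
    nearest = np.partition(dists, k - 1, axis=1)[:, :k]
    return nearest.mean(axis=1)


def truncated_value(quantiles, uncertainty, max_drop_frac=0.5, scale=1.0):
    """Average the lowest quantiles, dropping more of the top ones as uncertainty grows.

    Hypothetical adaptive rule: the dropped fraction rises smoothly from 0
    toward max_drop_frac as the uncertainty increases.
    """
    q = np.sort(np.asarray(quantiles, dtype=float))
    drop_frac = max_drop_frac * (1.0 - np.exp(-uncertainty / scale))
    n_keep = max(1, len(q) - int(round(drop_frac * len(q))))
    return q[:n_keep].mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dataset = rng.normal(size=(200, 3))          # stand-in for offline state-action pairs
    queries = np.array([[0.0, 0.0, 0.0],         # close to the data
                        [10.0, 10.0, 10.0]])     # far from the data
    u = knn_uncertainty(queries, dataset)
    quantiles = np.arange(10.0)                  # stand-in for pooled critic quantiles
    for ui in u:
        print(ui, truncated_value(quantiles, ui))
```

In this sketch, a query far from the dataset receives a larger distance score and therefore a more pessimistic value estimate, which is the qualitative behavior the abstract describes for out-of-distribution samples.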
Similar resources
Uncertainty-Aware Reinforcement Learning for Collision Avoidance
Reinforcement learning can enable complex, adaptive behavior to be learned automatically for autonomous robotic platforms. However, practical deployment of reinforcement learning methods must contend with the fact that the training process itself can be unsafe for the robot. In this paper, we consider the specific case of a mobile robot learning to navigate an a priori unknown environment while...
Direct Uncertainty Estimation in Reinforcement Learning
Optimal probabilistic approach in reinforcement learning is computationally infeasible. Its simplification consisting in neglecting difference between true environment and its model estimated using limited number of observations causes exploration vs exploitation problem. Uncertainty can be expressed in terms of a probability distribution over the space of environment models, and this uncertain...
Offline Evaluation of Online Reinforcement Learning Algorithms
In many real-world reinforcement learning problems, we have access to an existing dataset and would like to use it to evaluate various learning approaches. Typically, one would prefer not to deploy a fixed policy, but rather an algorithm that learns to improve its behavior as it gains more experience. Therefore, we seek to evaluate how a proposed algorithm learns in our environment, meaning we ...
Distance-Aware Beamforming for Multiuser Secure Communication Systems
Typical cryptography schemes are not well suited for low complexity types of equipment, e.g., Internet of things (IoT) devices, as they may need high power or impose high computational complexity on the device. Physical (PHY) layer security techniques such as beamforming (in multiple antennas systems) are possible alternatives to provide security for such applications. In this paper, we consid...
Value-Aware Loss Function for Model Learning in Reinforcement Learning
We consider the problem of estimating the transition probability kernel to be used by a model-based reinforcement learning (RL) algorithm. We argue that estimating a generative model that minimizes a probabilistic loss, such as the log-loss, might be an overkill because such a probabilistic loss does not take into account the underlying structure of the decision problem and the RL algorithm tha...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2023
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v37i9.26327